AU180 




PAGE: 1 RAW SEQUENCE LISTING DATE: 12/10/93 

PATENT APPLICATION US/08/070,455 TIME: 11:58:59 

INPUT SET: S2078.raw 

1 SEQUENCE LISTING 
2 

3 (1) General Information:
4 

5 (i) APPLICANT: HOFVANDER, Per 

6 PERSSON, Per T 

7 WIKSTROM, Olle 

8 TALLBERG, Anneli 
9 

10 (ii) TITLE OF INVENTION: GENETICALLY ENGINEERED MODIFICATION OF 

11 POTATO TO FORM AMYLOPECTIN-TYPE STARCH
12 

13 (iii) NUMBER OF SEQUENCES: 21 
14 

15 (iv) CORRESPONDENCE ADDRESS: 

16 (A) ADDRESSEE: Burns, Doane, Swecker & Mathis 

17 (B) STREET: George Mason Bldg., Washington & Prince Sts.

18 (C) CITY: Alexandria 

19 (D) STATE: Virginia 

20 (E) COUNTRY: United States 

21 (F) ZIP: 22313-1404
22 

23 (v) COMPUTER READABLE FORM: 

24 (A) MEDIUM TYPE: Floppy disk 

25 (B) COMPUTER: IBM PC compatible 

26 (C) OPERATING SYSTEM: PC-DOS/MS-DOS 

27 (D) SOFTWARE: PatentIn Release #1.0, Version #1.25
28

29 (vi) CURRENT APPLICATION DATA:

30 (A) APPLICATION NUMBER: US 08/070,455 

31 (B) FILING DATE: 09-JUN-1993 

32 (C) CLASSIFICATION: 
33 

34 (viii) ATTORNEY/AGENT INFORMATION: 

35 (A) NAME: Crane-Feury, Sharon E 

36 (B) REGISTRATION NUMBER: 36,113 

37 (C) REFERENCE/DOCKET NUMBER: 003300-293
38 

39 (ix) TELECOMMUNICATION INFORMATION: 

40 (A) TELEPHONE: (703) 836-6620 

41 (B) TELEFAX: (703) 836-2021 
42 
43 

44 (2) INFORMATION FOR SEQ ID NO: 1:
45

46 (i) SEQUENCE CHARACTERISTICS:

47 (A) LENGTH: 342 base pairs

48 (B) TYPE: nucleic acid

49 (C) STRANDEDNESS: double

50 (D) TOPOLOGY: linear 
51 

52 (ii) MOLECULE TYPE: DNA (genomic) 

53 

54 (ix) FEATURE: 

55 (A) NAME/KEY: CDS

56 (B) LOCATION: 217..342
57

58 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

59 

60 TGCATGTTTC CCTACATTCT ATTTAGAATC GTGTTGTGGT GTATAAACGT TGTTTCATAT 60 
61 

62 CTCATCTCAT CTATTCTGAT TTTGATTCTC TTGCCTACTG TAATCGGTGA TAAATGTGAA 120 
63 

64 TGCTTCCTTT CTTCTCAGAA ATCAATTTCT GTTTTGTTTT TGTTCATCTG TAGCTTATTC 180 
65 

66 TCTGGTAGAT TCCCCTTTTT GTAGACCACA CATCAC ATG GCA AGC ATC ACA GCT 234 

67 Met Ala Ser Ile Thr Ala

68 1 5
69 

70 TCA CAC CAC TTT GTG TCA AGA AGC CAA ACT TCA CTA GAC ACC AAA TCA 282 

71 Ser His His Phe Val Ser Arg Ser Gln Thr Ser Leu Asp Thr Lys Ser

72 10 15 20 
73 

74 ACC TTG TCA CAG ATA GGA CTC AGG AAC CAT ACT CTG ACT CAC AAT GGT 330 

75 Thr Leu Ser Gln Ile Gly Leu Arg Asn His Thr Leu Thr His Asn Gly

76 25 30 35 
77 

78 TTA AGG GCT GTT 342 

79 Leu Arg Ala Val 

80 40 
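The CDS feature of SEQ ID NO: 1 (LOCATION 217..342) can be cross-checked against the amino acid lines printed under the nucleotide sequence. The following is a minimal sketch, assuming the standard genetic code (NCBI translation table 1) and 1-based, inclusive feature coordinates. The helper names `cds_slice` and `translate` are illustrative only and are not part of the listing or of PatentIn.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# SEQ ID NO: 1 as printed in the listing; spaces are layout only and are removed.
SEQ1 = (
    "TGCATGTTTC CCTACATTCT ATTTAGAATC GTGTTGTGGT GTATAAACGT TGTTTCATAT "
    "CTCATCTCAT CTATTCTGAT TTTGATTCTC TTGCCTACTG TAATCGGTGA TAAATGTGAA "
    "TGCTTCCTTT CTTCTCAGAA ATCAATTTCT GTTTTGTTTT TGTTCATCTG TAGCTTATTC "
    "TCTGGTAGAT TCCCCTTTTT GTAGACCACA CATCAC ATG GCA AGC ATC ACA GCT "
    "TCA CAC CAC TTT GTG TCA AGA AGC CAA ACT TCA CTA GAC ACC AAA TCA "
    "ACC TTG TCA CAG ATA GGA CTC AGG AAC CAT ACT CTG ACT CAC AAT GGT "
    "TTA AGG GCT GTT"
).replace(" ", "")


def cds_slice(seq, start, end):
    """Return the 1-based, inclusive range start..end of seq (hypothetical helper)."""
    return seq[start - 1:end]


def translate(dna):
    """Translate complete codons of dna into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


cds = cds_slice(SEQ1, 217, 342)
protein = translate(cds)
print(len(SEQ1), len(cds), protein)
```

Under these assumptions the slice is 126 nucleotides long and translates to 42 residues beginning MASITA, which corresponds to the Met Ala Ser Ile Thr Ala line printed under the sequence.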
81 

82 

83 (2) INFORMATION FOR SEQ ID NO: 2:
84 

85 (i) SEQUENCE CHARACTERISTICS: 

86 (A) LENGTH: 2549 base pairs 

87 (B) TYPE: nucleic acid 

88 (C) STRANDEDNESS: single 
89 (D) TOPOLOGY: linear

90 

91 (ii) MOLECULE TYPE: DNA (genomic) 

92 

93 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



94 

95 AACAAGCTTG ATGGGCTCCA ATCAACAACT AATACTAAGG TAACACCCAA GATGGCATCC 60 
96 

97 AGAACTGAGA CCAAGAGACC TGGATGCTCA GCTACCATTG TTTGTGGAAA GGGAATGAAC 120 
98 

99 TTGATCTTTG TGGGTACTGA GGTTGGTCCT TGGAGCAAAA CTGGTGGACT AGGTGATGTT 180 
100 

101 CTTGGTGGAC TACCACCAGC CCTTGCAGTA AGTCTTTCTT TCATTTGGTT ACCTACTCAT 240 
102 




103 TCATTACTTA TTTTGTTTAG TTAGTTTCTA CTGCATCAGT CTTTTTATCA TTTAGGCCCG 300 
104 

105 CGGACATCGG GTAATGACAA TATCCCCCCG TTATGACCAA TACAAAGATG CTTGGGATAC 360 
106 

107 TGGCGTTGCG GTTGAGGTAC ATCTTCCTAT ATTGATACGG TACAATATTG TTCTCTTACA 420 
108 

109 TTTCCTGATT CAAGAATGTG ATCATCTGCA GGTCAAAGTT GGAGACAGCA TTGAAATTGT 480 
110 

111 TCGTTTCTTT CACTGCTATA AACGTGGGGT TGATCGTGTT TTTGTTGACC ACCCAATGTT 540 
112 

113 CTTGGAGAAA GTAAGCATAT TATGATTATG AATCCGTCCT GAGGGATACG CAGAACAGGT 600 
114 

115 CATTTTGAGT ATCTTTTAAC TCTACTGGTG CTTTTACTCT TTTAAGGTTT GGGGCAAAAC 660 
116 

117 TGGTTCAAAA ATCTATGGCC CCAAAGCTGG ACTAGATTAT CTGGACAATG AACTTAGGTT 720 
118 

119 CAGCTTGTTG TGTCAAGTAA GTTAGTTACT CTTGATTTTT ATGTGGCATT TTACTCTTTT 780 
120 

121 GTCTTTAATC GTTTTTTTAA CCTTGTTTTC TCAGGCAGCC CTAGAGGCAC CTAAAGTTTT 840 
122 

123 GAATTTGAAC AGTAGCAACT ACTTCTCAGG ACCATATGGT AATTAACACA TCCTAGTTTC 900 
124 

125 AGAAAACTCC TTACTATATC ATTGTAGGTA ATCATCTTTA TTTTGCCTAT TCCTGCAGGA 960
126

127 GAGGATGTTC TCTTCATTGC CAATGATTGG CACACAGCTC TCATTCCTTG CTACTTGAAG 1020
128

129 TCAATGTACC AGTCCAGAGG AATCTACTTG AATGCCAAGG TAAAATTTCT TTGTATTCAC 1080
130 

131 TCGATTGCAC GTTACCCTGC AAATCAGTAA GGTTGTATTA ATATATGATA AATTTCACAT 1140 
132 

133 TGCCTCCAGG TTGCTTTCTG CATCCATAAC ATTGCCTACC AAGGTCGATT TTCTTTCTCT 1200 
134 

135 GACTTCCCTC TTCTCAATCT TCCTGATGAA TTCAGGGGTT CTTTTGATTT CATTGATGGG 1260 
136 

137 TATGTATTTA TGCTTGAAAT CAGACCTCCA ACTTTTGAAG CTCTTTTGAT GCTAGTAAAT 1320
138

139 TGAGTTTTTA AAATTTTGCA GATATGAGAA GCCTGTTAAG GGTAGGAAAA TCAACTGGAT 1380
140 

141 GAAGGCTGGG ATATTAGAAT CACATAGGGT GGTTACAGTG AGCCCATACT ATGCCCAAGA 1440 
142 

143 ACTTGTCTCT GCTGTTGACA AGGGAGTTGA ATTGGACAGT GTCCTTCGTA AGACTTGCAT 1500 
144 

145 AACTGGGATT GTGAATGGCA TGGATACACA AGAGTGGAAC CCAGCGACTG ACAAATACAC 1560
146 

147 AGATGTCAAA TACGATATAA CCACTGTAAG ATAAGATTTT TCCGACTCCA GTATATACTA 1620 
148 

149 AATTATTTTG TATGTTTATG AAATTAAAGA GTTCTTGCTA ATCAAAATCT CTATACAGGT 1680 
150 

151 CATGGACGCA AAACCTTTAC TAAAGGAGGC TCTTCAAGCA GCAGTTGGCT TGCCTGTTGA 1740 
152 

153 CAAGAAGATC CCTTTGATTG GCTTCATCGG CAGACTTGAG GAGCAGAAAG GTTCAGATAT 1800 

154 

155 TCTTGTTGCT GCAATTCACA AGTTCATCGG ATTGGATGTT CAAATTGTAG TCCTTGTAAG 1860
156

157 TACCAAATGG ACTCATGGTA TCTCTCTTGT TGAGTTTACT TGTGCCGAAA CTGAAATTGA 1920
158

159 CCTGCTACTC ATCCTATGCA TCAGGGAACT GGCAAAAAGG AGTTTGAGCA GGAGATTGAA 1980
160

161 CAGCTCGAAG TGTTGTACCC TAACAAAGCT AAAGGAGTGG CAAAATTCAA TGTCCCTTTG 2040
162

163 GCTCACATGA TCACTGCTGG TGCTGATTTT ATGTTGGTTC CAAGCAGATT TGAACCTTGT 2100
164

165 GGTCTCATTC AGTTACATGC TATGCGATAT GGAACAGTAA GAACCAGAAG AGCTTGTACC 2160
166

167 TTTTTACTGA GTTTTTAAAA AAAGAATCAT AAGACCTTGT TTTCCATCTA AAGTTTAATA 2220
168

169 ACCAACTAAA TGTTACTGCA GCAAGCTTTT CATTTCTGAA AATTGGTTAT CTGATTTTAA 2280
170

171 CGTAATCACA TGTGAGTCAG GTACCAATCT GTGCATCGAC TGGTGGACTT GTTGACACTG 2340
172

173 TGAAAGAAGG CTATACTGGA TTCCATATGG GAGCCTTCAA TGTTGAAGTA TGTGATTTTA 2400
174

175 CATCAATTGT GTACTTGTAC ATGGTCCATT CTCGTCTTGA TATACCCCTT GTTGCATAAA 2460
176

177 CATTAACTTA TTGCTTCTTG AATTTGGTTA GTGCGATGTT GTTGACCCAG CTGATGTGCT 2520
178

179 TAAGATAGTA ACAACAGTTG CTAGAGCTC 2549
180
181

182 (2) INFORMATION FOR SEQ ID NO: 3:
183

184 (i) SEQUENCE CHARACTERISTICS:

185 (A) LENGTH: 492 base pairs

186 (B) TYPE: nucleic acid

187 (C) STRANDEDNESS: double

188 (D) TOPOLOGY: linear
189

190 (ii) MOLECULE TYPE: DNA (genomic)
191

192 (ix) FEATURE:

193 (A) NAME/KEY: CDS

194 (B) LOCATION: 1..15
195

196 (ix) FEATURE:

197 (A) NAME/KEY: CDS

198 (B) LOCATION: 101..218
199

200

201 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
202

203 GAG CTC TCC TGG AAG GTAAGTGTGA ATTTGATAAT TTGCGTAGGT ACTTCAGTTT 55

204 Glu Leu Ser Trp Lys

205 1 5

206 

207 GTTGTTCTCG TCAGCACTGA TGGATTCCAA CTGGTGTTCT TGCAG GAA CCT GCC 109 

208 Glu Pro Ala 

209 1 
210 

211 AAG AAA TGG GAG ACA TTG CTA TTG GGC TTA GGA GCT TCT GGC AGT GAA 157 

212 Lys Lys Trp Glu Thr Leu Leu Leu Gly Leu Gly Ala Ser Gly Ser Glu 

213 5 10 15 
214 

215 CCC GGT GTT GAA GGG GAA GAA ATC GCT CCA CTT GCC AAG GAA AAT GTA 205 

216 Pro Gly Val Glu Gly Glu Glu Ile Ala Pro Leu Ala Lys Glu Asn Val

217 20 25 30 35 
218 

219 GCC ACT CCT TAAATGAGCT TTGGTTATCC TTGTTTCAAC AATAAGATCA 254 

220 Ala Thr Pro * 
221 

222 

223 TTAAGCAAAC GTATTTACTA GCGAACTATG TAGAACCCTA TTATGGGGTC TCAATCATCT 314 
224 

225 ACAAAATGAT TGGTTTTTGC TGGGGAGCAG CAGCATATAA GGCTGTAAAA TCCTGGTTAA 374 
226 

227 TGTTTTTGTA GGTAAGGGCT ATTTAAGGTG GTGTGGATCA AAGTCAATAG AAAATAGTTA 434
228

229 TTACTAACGT TTGCAACTAA ATACTTAGTA ATGTAGCATA AATAATACTA GAACTAGT 492
230 

231 

232 (2) INFORMATION FOR SEQ ID NO: 4:
233

234 (i) SEQUENCE CHARACTERISTICS:

235 (A) LENGTH: 987 base pairs

236 (B) TYPE: nucleic acid

237 (C) STRANDEDNESS: single

238 (D) TOPOLOGY: linear 
239 

240 (ii) MOLECULE TYPE: DNA (genomic) 

241 

242 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

243 

244 AAGCTTTAAC GAGATAGAAA ATTATGTTAC TCCGTTTTGT TCATTACTTA ACAAATGCAA 60 
245 

246 CAGTATCTTG TACCAAATCC TTTCTCTCTT TTCAAACTTT TCTATTTGGC TGTTGACGGA 120 
247 

248 GTAATCAGGA TACAAACCAC AAGTATTTAA TTGACTCCTC CGCCAGATAT TATGATTTAT 180 
249 

250 GAATCCTCGA AAAGCCTATC CATTAAGTCC TCATCTATGG ATATACTTGA CAGTATCTTC 240 
251 

252 CTGTTTGGGT ATTTTTTTTT CCTGCCAAGT GGAACGGAGA CATGTTATGA TGTATACGGG 300

253 

254 AAGCTCGTTA AAAAAAAATA CAATAGGAAG AAATGTAACA AACATTGAAT GTTGTTTTTA 360 
255 




256 ACCATCCTTC CTTTAGCAGT GTATCAATTT TGTAATAGAA CCATGCATCT CAATCTTAAT 420

257 

258 ACTAAAATGC AACTTAATAT AGGCTAAACC AAGATAAAGT AATGTATTCA ACCTTTAGAA 480 
259 

260 TTGTGCATTC ATAATTAGAT CTTGTTTGTC GTAAAAAATT AGAAAATATA TTTACAGTAA 540 
261 

262 TTTGGAATAC AAAGCTAAGG GGGAAGTAAC TAATATTCTA GTGGAGGGAG GGACCAGTAC 600 
263 

264 CAGTACCTAG ATATTATTTT TAATTACTAT AATAATAATT TAATTAACAC GAGACATAGG 660 
265 

266 AATGTCAAGT GGTAGCGTAG GAGGGAGTTG GTTTAGTTTT TTAGATACTA GGAGACAGAA 720 
267 

268 CCGGACGGCC CATTGCAAGG CCAAGTTGAA GTCCAGCCGT GAATCAACAA AGAGAGGGCC 780 
269 

270 CATAATACTG TCGATGAGCA TTTCCCTATA ATACAGTGTC CACAGTTGCC TTCTGCTAAG 840
271

272 GGATAGCCAC CCGCTATTCT CTTGACACGT GTCACTGAAA CCTGCTACAA ATAAGGCAGG 900
273

274 CACCTCCTCA TTCTCACTCA CTCACTCACA CAGCTCAACA AGTGGTAACT TTTACTCATC 960
275

276 TCCTCCAATT ATTTCTGATT TCATGCA 987

277 

278 



279 (2) INFORMATION FOR SEQ ID NO: 5:
280 

281 (i) SEQUENCE CHARACTERISTICS: 

282 (A) LENGTH: 4964 base pairs 

283 (B) TYPE: nucleic acid 

284 (C) STRANDEDNESS: single 

285 (D) TOPOLOGY: linear 
286 

287 (ii) MOLECULE TYPE: DNA (genomic) 
288 

289 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

290

291 AAGCTTTAAC GAGATAGAAA ATTATGTTAC TCCGTTTTGT TCATTACTTA ACAAATGCAA 60
292

293 CAGTATCTTG TACCAAATCC TTTCTCTCTT TTCAAACTTT TCTATTTGGC TGTTGACGGA 120
294

295 GTAATCAGGA TACAAACCAC AAGTATTTAA TTGACTCCTC CGCCAGATAT TATGATTTAT 180
296

297 GAATCCTCGA AAAGCCTATC CATTAAGTCC TCATCTATGG ATATACTTGA CAGTATCTTC 240
298

299 CTGTTTGGGT ATTTTTTTTT CCTGCCAAGT GGAACGGAGA CATGTTATGA TGTATACGGG 300
300

301 AAGCTCGTTA AAAAAAAATA CAATAGGAAG AAATGTAACA AACATTGAAT GTTGTTTTTA 360
302

303 ACCATCCTTC CTTTAGCAGT GTATCAATTT TGTAATAGAA CCATGCATCT CAATCTTAAT 420
304 

305 ACTAAAATGC AACTTAATAT AGGCTAAACC AAGATAAAGT AATGTATTCA ACCTTTAGAA 480 
306 
307 TTGTGCATTC ATAATTAGAT CTTGTTTGTC GTAAAAAATT AGAAAATATA TTTACAGTAA 540
308

309 TTTGGAATAC AAAGCTAAGG GGGAAGTAAC TAATATTCTA GTGGAGGGAG GGACCAGTAC 600
310

311 CAGTACCTAG ATATTATTTT TAATTACTAT AATAATAATT TAATTAACAC GAGACATAGG 660
312

313 AATGTCAAGT GGTAGCGTAG GAGGGAGTTG GTTTAGTTTT TTAGATACTA GGAGACAGAA 720
314

315 CCGGACGGCC CATTGCAAGG CCAAGTTGAA GTCCAGCCGT GAATCAACAA AGAGAGGGCC 780
316

317 CATAATACTG TCGATGAGCA TTTCCCTATA ATACAGTGTC CACAGTTGCC TTCTGCTAAG 840
318

319 GGATAGCCAC CCGCTATTCT CTTGACACGT GTCACTGAAA CCTGCTACAA ATAAGGCAGG 900
320

321 CACCTCCTCA TTCTCACTCA CTCACTCACA CAGCTCAACA AGTGGTAACT TTTACTCATC 960
322

323 TCCTCCAATT ATTTCTGATT TCATGCATGT TTCCCTACAT TCTATTATGA ATCGTGTTGT 1020
324

325 GGTGTATAAA CGTTGTTTCA TATCTCATCT CATCTATTCT GATTTTGATT CTCTTGCCTA 1080
326

327 CTGTAATCGG TGATAAATGT GAATGCTTCC TTTCTTCTCA GAAATCAATT TCTGTTTTGT 1140
328

329 TTTTGTTCAT CTGTAGCTTA TTCTCTGGTA GATTCCCCTT TTTGTAGACC ACACATCACA 1200
330

331 TGGCAAGCAT CACAGCTTCA CACCACTTTG TGTCAAGAAG CCAAACTTCA CTAGACACCA 1260
332

333 AATCAACCTT GTCACAGATA GGACTCAGGA ACCATACTCT GACTCACAAT GGTTTAAGGG 1320
334

335 CTGTTAACAA GCTTGATGGG CTCCAATCAA CAACTAATAC TAAGGTAACA CCCAAGATGG 1380
336

337 CATCCAGAAC TGAGACCAAG AGACCTGGAT GCTCAGCTAC CATTGTTTGT GGAAAGGGAA 1440
338

339 TGAACTTGAT CTTTGTGGGT ACTGAGGTTG GTCCTTGGAG CAAAACTGGT GGACTAGGTG 1500
340

341 ATGTTCTTGG TGGACTACCA CCAGCCCTTG CAGTAAGTCT TTCTTTCATT TGGTTACCTA 1560
342

343 CTCATTCATT ACTTATTTTG TTTAGTTAGT TTCTACTGCA TCAGTCTTTT TATCATTTAG 1620
344

345 GCCCGCGGAC AGCGGGTAAT GACAATATCC CCCCGTTATG ACCAATACAA AGATGCTTGG 1680
346

347 GATACTGGCG TTGCGGTTGA GGTACATCTT CCTATATTGA TACGGTACAA TATTGTTCTC 1740
348

349 TTACATTTCC TGATTCAAGA ATGTGATCAT CTGCAGGTCA AAGTTGGAGA CAGCATTGAA 1800
350

351 ATTGTTCGTT TCTTTCACTG CTATAAACGT GGGGTTGATC GTGTTTTTGT TGACCACCCA 1860
352

353 ATGTTCTTGG AGAAAGTAAG CATATTATGA TTATGAATCC GTCCTGAGGG ATACGCAGAA 1920
354

355 CAGGTCATTT TGAGTATCTT TTAACTCTAC TGGTGCTTTT ACTCTTTTAA GGTTTGGGGC 1980
356

357 AAAACTGGTT CAAAAATCTA TGGCCCCAAA GCTGGACTAG ATTATCTGGA CAATGAACTT 2040
358

359 AGGTTCAGCT TGTTGTGTCA AGTAAGTTAG TTACTCTTGA TTTTTATGTG GCATTTTACT 2100 
360 

361 CTTTTGTCTT TAATCGTTTT TTTAACCTTG TTTTCTCAGG CAGCCCTAGA GGCACCTAAA 2160 
362 

363 GTTTTGAATT TGAACAGTAG CAACTACTTC TCAGGACCAT ATGGTAATTA ACACATCCTA 2220 
364 

365 GTTTCAGAAA ACTCCTTACT ATATCATTGT AGGTAATCAT CTTTATTTTG CCTATTCCTG 2280 
366 

367 CAGGAGAGGA TGTTCTCTTC ATTGCCAATG ATTGGCACAC AGCTCTCATT CCTTGCTACT 2340 
368 

369 TGAAGTCAAT GTACCAGTCC AGAGGAATCT ACTTGAATGC CAAGGTAAAA TTTCTTTGTA 2400 
370 

371 TTCACTCGAT TGCACGTTAC CCTGCAAATC AGTAAGGTTG TATTAATATA TGATAAATTT 2460 
372 

373 CACATTGCCT CCAGGTTGCT TTCTGCATCC ATAACATTGC CTACCAAGGT CGATTTTCTT 2520
374

375 TCTCTGACTT CCCTCTTCTC AATCTTCCTG ATGAATTCAG GGGTTCTTTT GATTTCATTG 2580
376

377 ATGGGTATGT ATTTATGCTT GAAATCAGAC CTCCAACTTT TGAAGCTCTT TTGATGCTAG 2640
378

379 TAAATTGAGT TTTTAAAATT TTGCAGATAT GAGAAGCCTG TTAAGGGTAG GAAAATCAAC 2700
380

381 TGGATGAAGG CTGGGATATT AGAATCACAT AGGGTGGTTA CAGTGAGCCC ATACTATGCC 2760
382

383 CAAGAACTTG TCTCTGCTGT TGACAAGGGA GTTGAATTGG ACAGTGTCCT TCGTAAGACT 2820
384

385 TGCATAACTG GGATTGTGAA TGGCATGGAT ACACAAGAGT GGAACCCAGC GACTGACAAA 2880
386

387 TACACAGATG TCAAATACGA TATAACCACT GTAAGATAAG ATTTTTCCGA CTCCAGTATA 2940
388

389 TACTAAATTA TTTTGTATGT TTATGAAATT AAAGAGTTCT TGCTAATCAA AATCTCTATA 3000
390

391 CAGGTCATGG ACGCAAAACC TTTACTAAAG GAGGCTCTTC AAGCAGCAGT TGGCTTGCCT 3060
392

393 GTTGACAAGA AGATCCCTTT GATTGGCTTC ATCGGCAGAC TTGAGGAGCA GAAAGGTTCA 3120
394

395 GATATTCTTG TTGCTGCAAT TCACAAGTTC ATCGGATTGG ATGTTCAAAT TGTAGTCCTT 3180
396

397 GTAAGTACCA AATGGACTCA TGGTATCTCT CTTGTTGAGT TTACTTGTGC CGAAACTGAA 3240
398

399 ATTGACCTGC TACTCATCCT ATGCATCAGG GAACTGGCAA AAAGGATTTT GAGCAGGAGA 3300
400

401 TTGAACAGCT CGAAGTGTTG TACCCTAACA AAGCTAAAGG AGTGGCAAAA TTCAATGTCC 3360
402

403 CTTTGGCTCA CATGATCACT GCTGGTGCTG ATTTTATGTT GGTTCCAAGC AGATTTGAAC 3420
404 

405 CTTGTGGTCT CATTCAGTTA CATGCTATGC GATATGGAAC AGTAAGAACC AGAAGAGCTT 3480 
406 

407 GTACCTTTTT ACTGAGTTTT TAAAAAAAGA ATCATAAGAC CTTGTTTTCC ATCTAAAGTT 3540 
408 
409 TAATAACCAA CTAAATGTTA CTGCAGCAAG CTTTTCATTT CTGAAAATTG GTTATCTGAT 3600
410

411 TTTAACGTAA TCACATGTGA GTCAGGTACC AATCTGTGCA TCGACTGGTG GACTTGTTGA 3660
412

413 CACTGTGAAA GAAGGCTATA CTGGATTCCA TATGGGAGCC TTCAATGTTG AAGTATGTGA 3720
414

415 TTTTACATCA ATTGTGTACT TGTACATGGT CCATTCTCGT CTTGATATAC CCCTTGTTGC 3780
416

417 ATAAACATTA ACTTATTGCT TCTTGAATTT GGTTAGTGCG ATGTTGTTGA CCCAGCTGAT 3840
418

419 GTGCTTAAGA TAGTAACAAC AGTTGCTAGA GCTCTTGCAG TCTATGGCAC CCTCGCATTT 3900
420

421 GCTGAGATGA TAAAAAATTG CATGTCAGAG GAGCTCTCCT GGAAGGTAAG TGTGAATTTG 3960
422

423 ATAATTTGCG TAGGTACTTC AGTTTGTTGT TCTCGTCAGC ACTGATGGAT TCCAACTGGT 4020
424

425 GTTCTTGCAG GAACCTGCCA AGAAATGGGA GACATTGCTA TTGGGCTTAG GAGCTTCTGG 4080
426

427 CAGTGAACCC GGTGTTGAAG GGGAAGAAAT CGCTCCACTT GCCAAGGAAA ATGTAGCCAC 4140
428

429 TCCTTAAATG AGCTTTGGTT ATCCTTGTTT CAACAATAAG ATCATTAAGC AAACGTATTT 4200
430

431 ACTAGCGAAC TATGTAGAAC CCTATTATGG GGTCTCAATC ATCTACAAAA TGATTGGTTT 4260
432

433 TTGCTGGGGA GCAGCAGCAT ATAAGGCTGT AAAATCCTGG TTAATGTTTT TGTAGGTAAG 4320
434

435 GGCTATTTAA GGTGGTGTGG ATCAAAGTCA ATAGAAAATA GTTATTACTA ACGTTTGCAA 4380
436

437 CTAAATACTT AGTAATGTAG CATAAATAAT ACTAGAACTA GTAGCTAATA TATATGCGTG 4440
438

439 AATTTGTTGT ACCTTTTCTT GCATAATTAT TTGCAGTACA TATATAATGA AAATTAC CCA 4500
440

441 AGGAATCAAT GTTTCTTGCT CCGTCCTCCT TTGATGATTT TTTACGCAAT ACAGAGCTAG 4560
442

443 TGTGTTATGT TATAAATTTT GTTTAAAAGA AGTAATCAAA TTCAAATTAG TTGTTTGGTC 4620
444

445 ATATGAAAGA AGCTGCCAGG CTAACTTTGA GGAGATGGCT ATTGAATTTC AAAATGATTA 4680
446

447 TGTGAAAACA ATGCAACATC TATGTCAATC AACACTTAAA TTATTGCATT TAGAAAGATA 4740
448

449 TTTTTGAGCC CATGACACAT TCATTCATAA AGTAAGGTAG TATGTATGAT TGAATGGACT 4800
450

451 ACAGCTCAAT CAAAGCATCT CCTTTACATA ACGGCACTGT CTCTTGTCTA CTACTCTATT 4860
452

453 GGTAGTAGTA GTAGTAATTT TACAATCCAA ATTGAATAGT AATAAGATGC TCTCTATTTA 4920
454

455 CTAAAGTAGT AGTATTATTC TTTCGTTACT CTAAAGCAAC AAAA 4964
456
457

458 (2) INFORMATION FOR SEQ ID NO: 6:
459

460 (i) SEQUENCE CHARACTERISTICS: 

461 (A) LENGTH: 69 amino acids 

462 (B) TYPE: amino acid 

463 (C) STRANDEDNESS: single

464 (D) TOPOLOGY: linear
465 

466 (ii) MOLECULE TYPE: peptide 

467 

468 (ix) FEATURE: 

469 (A) NAME/KEY: Modified-site

470 (B) LOCATION: 1..69 

471 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

472 by nucleotides 1-207 of SEQ ID NO. 2." 
473 

474 

475 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

476

477 Asn Lys Leu Asp Gly Leu Gln Ser Thr Thr Asn Thr Lys Val Thr Pro

478 1 5 10 15 
479 

480 Lys Met Ala Ser Arg Thr Glu Thr Lys Arg Pro Gly Cys Ser Ala Thr 

481 20 25 30 
482 

483 Ile Val Cys Gly Lys Gly Met Asn Leu Ile Phe Val Gly Thr Glu Val

484 35 40 45 
485 

486 Gly Pro Trp Ser Lys Thr Gly Gly Leu Gly Asp Val Leu Gly Gly Leu 

487 50 55 60 
488 

489 Pro Pro Ala Leu Ala 

490 65 
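The feature note for SEQ ID NO: 6 states that the peptide is encoded by nucleotides 1-207 of SEQ ID NO. 2, and the declared length is 69 amino acids. The sketch below checks that these figures agree (207 = 3 x 69) and converts the three-letter listing to one-letter codes. It assumes the standard three-letter abbreviations (including Gln and Ile); `to_one_letter` and `span_length` are hypothetical helper names introduced only for this illustration.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Residues of SEQ ID NO: 6 as given in the listing (position rulers omitted).
SEQ6_LISTING = """
Asn Lys Leu Asp Gly Leu Gln Ser Thr Thr Asn Thr Lys Val Thr Pro
Lys Met Ala Ser Arg Thr Glu Thr Lys Arg Pro Gly Cys Ser Ala Thr
Ile Val Cys Gly Lys Gly Met Asn Leu Ile Phe Val Gly Thr Glu Val
Gly Pro Trp Ser Lys Thr Gly Gly Leu Gly Asp Val Leu Gly Gly Leu
Pro Pro Ala Leu Ala
"""


def to_one_letter(listing):
    """Convert whitespace-separated three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[code] for code in listing.split())


def span_length(first, last):
    """Length of a 1-based, inclusive nucleotide range first..last."""
    return last - first + 1


peptide = to_one_letter(SEQ6_LISTING)
nucleotides = span_length(1, 207)
print(len(peptide), nucleotides, nucleotides == 3 * len(peptide))
```

The same check can be applied to the other peptide entries. A range whose length is not a multiple of three would then need a closer look at the note, since this sketch does not model codons split across non-contiguous ranges.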
491 

492 

493 (2) INFORMATION FOR SEQ ID NO: 7: 
494 

495 (i) SEQUENCE CHARACTERISTICS: 

496 (A) LENGTH: 27 amino acids

497 (B) TYPE: amino acid

498 (C) STRANDEDNESS: single

499 (D) TOPOLOGY: linear
500 

501 (ii) MOLECULE TYPE: peptide 

502 

503 (ix) FEATURE: 

504 (A) NAME/KEY: Modified-site

505 (B) LOCATION: 1..27

506 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

507 by nucleotides 296-377 of SEQ ID NO. 2." 
508 

509 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

510 



511 Ala Arg Gly His Arg Val Met Thr Ile Ser Pro Arg Tyr Asp Gln Tyr

512 1 5 10 15
513 

514 Lys Asp Ala Trp Asp Thr Gly Val Ala Val Glu 

515 20 25 
516 

517 

518 (2) INFORMATION FOR SEQ ID NO: 8: 
519 

520 (i) SEQUENCE CHARACTERISTICS: 

521 (A) LENGTH: 33 amino acids 

522 (B) TYPE: amino acid 

523 (C) STRANDEDNESS: single

524 (D) TOPOLOGY: linear 
525 

526 (ii) MOLECULE TYPE: peptide 

527 

528 (ix) FEATURE: 

529 (A) NAME/KEY: Modified-site

530 (B) LOCATION: 1..33 

531 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

532 by nucleotides 452-550 of SEQ ID NO. 2." 
533 

534 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

535 

536 Val Lys Val Gly Asp Ser Ile Glu Ile Val Arg Phe Phe His Cys Tyr

537 1 5 10 15
538

539 Lys Arg Gly Val Asp Arg Val Phe Val Asp His Pro Met Phe Leu Glu
540 20 25 30 

541 

542 Lys 

543 

544 

545 (2) INFORMATION FOR SEQ ID NO: 9:
546 

547 (i) SEQUENCE CHARACTERISTICS: 

548 (A) LENGTH: 30 amino acids 

549 (B) TYPE: amino acid 

550 (C) STRANDEDNESS: single 

551 (D) TOPOLOGY: linear 
552 

553 (ii) MOLECULE TYPE: peptide 

554 

555 (ix) FEATURE: 

556 (A) NAME/KEY: Modified-site

557 (B) LOCATION: 1..30 

558 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

559 by nucleotides 647-736 of SEQ ID NO. 2." 
560 

561 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



562 

563 Val Trp Gly Lys Thr Gly Ser Lys Ile Tyr Gly Pro Lys Ala Gly Leu

564 1 5 10 15
565

566 Asp Tyr Leu Asp Asn Glu Leu Arg Phe Ser Leu Leu Cys Gln

567 20 25 30 
568 

569 

570 (2) INFORMATION FOR SEQ ID NO: 10: 
571 

572 (i) SEQUENCE CHARACTERISTICS: 

573 (A) LENGTH: 21 amino acids 

574 (B) TYPE: amino acid 

575 (C) STRANDEDNESS: single

576 (D) TOPOLOGY: linear 
577 

578 (ii) MOLECULE TYPE: peptide 

579 

580 (ix) FEATURE: 

581 (A) NAME/KEY: Modified-site

582 (B) LOCATION: 1..21 

583 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

584 by nucleotides 815-878 of SEQ ID NO. 2." 
585 

586 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

587 

588 Ala Ala Leu Glu Ala Pro Lys Val Leu Asn Leu Asn Ser Ser Asn Tyr 

589 1 5 10 15
590 

591 Phe Ser Gly Pro Tyr 

592 20 
593 

594 

595 (2) INFORMATION FOR SEQ ID NO: 11: 
596 

597 (i) SEQUENCE CHARACTERISTICS: 

598 (A) LENGTH: 34 amino acids 

599 (B) TYPE: amino acid 

600 (C) STRANDEDNESS: single 

601 (D) TOPOLOGY: linear 
602 

603 (ii) MOLECULE TYPE: peptide 

604 

605 (ix) FEATURE: 

606 (A) NAME/KEY: Modified-site

607 (B) LOCATION: 1..34

608 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

609 by nucleotides 878 and 959-1059 of SEQ ID NO. 2." 
610 

611 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

612 

613 Gly Glu Asp Val Leu Phe Ile Ala Asn Asp Trp His Thr Ala Leu Ile

614 1 5 10 15
615

616 Pro Cys Tyr Leu Lys Ser Met Tyr Gln Ser Arg Gly Ile Tyr Leu Asn

617 20 25 30 
618 

619 Ala Lys 

620 

621 

622 (2) INFORMATION FOR SEQ ID NO:12: 
623 

624 (i) SEQUENCE CHARACTERISTICS: 

625 (A) LENGTH: 38 amino acids 

626 (B) TYPE: amino acid 

627 (C) STRANDEDNESS: single 

628 (D) TOPOLOGY: linear 
629 

630 (ii) MOLECULE TYPE: peptide 

631 

632 (ix) FEATURE: 

633 (A) NAME/KEY: Modified-site

634 (B) LOCATION: 1..38

635 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded

636 by nucleotides 1150-1263 of SEQ ID NO 2." 
637 

638 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 

639 

640 Val Ala Phe Cys Ile His Asn Ile Ala Tyr Gln Gly Arg Phe Ser Phe

641 1 5 10 15
642 

643 Ser Asp Phe Pro Leu Leu Asn Leu Pro Asp Glu Phe Arg Gly Ser Phe 

644 20 25 30 
645 

646 Asp Phe Ile Asp Gly Tyr

647 35 
648 

649 

650 (2) INFORMATION FOR SEQ ID NO: 13: 
651 

652 (i) SEQUENCE CHARACTERISTICS: 

653 (A) LENGTH: 79 amino acids 

654 (B) TYPE: amino acid 

655 (C) STRANDEDNESS: single 

656 (D) TOPOLOGY: linear 
657 

658 (ii) MOLECULE TYPE: peptide 

659 

660 (ix) FEATURE: 

661 (A) NAME/KEY: Modified-site

662 (B) LOCATION: 1..79 

663 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

664 by nucleotides 1349-1585 of SEQ ID NO 2." 

665 

666 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

667 

668 Lys Pro Val Lys Gly Arg Lys Ile Asn Trp Met Lys Ala Gly Ile Leu

669 1 5 10 15
670

671 Glu Ser His Arg Val Val Thr Val Ser Pro Tyr Tyr Ala Gln Glu Leu

672 20 25 30 
673 

674 Val Ser Ala Val Asp Lys Gly Val Glu Leu Asp Ser Val Leu Arg Lys 

675 35 40 45 
676 

677 Thr Cys Ile Thr Gly Ile Val Asn Gly Met Asp Thr Gln Glu Trp Asn

678 50 55 60
679

680 Pro Ala Thr Asp Lys Tyr Thr Asp Val Lys Tyr Asp Ile Thr Thr

681 65 70 75 
682 

683 

684 (2) INFORMATION FOR SEQ ID NO: 14:
685 

686 (i) SEQUENCE CHARACTERISTICS: 

687 (A) LENGTH: 59 amino acids 

688 (B) TYPE: amino acid 

689 (C) STRANDEDNESS: single 

690 (D) TOPOLOGY: linear
691 

692 (ii) MOLECULE TYPE: peptide 

693 

694 (ix) FEATURE: 

695 (A) NAME/KEY: Modified-site

696 (B) LOCATION: 1..59 

697 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

698 by nucleotides 1676-1855 of SEQ ID NO 2." 
699 

700 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 

701 

702 Val Met Asp Ala Lys Pro Leu Leu Lys Glu Ala Leu Gln Ala Ala Val

703 1 5 10 15
704

705 Gly Leu Pro Val Asp Lys Lys Ile Pro Leu Ile Gly Phe Ile Gly Arg

706 20 25 30
707

708 Leu Glu Glu Gln Lys Gly Ser Asp Ile Leu Ala Val Ala Ile His Lys

709 35 40 45
710

711 Phe Ile Gly Leu Asp Val Gln Ile Val Val Leu

712 50 55 
713 

714 






715 (2) INFORMATION FOR SEQ ID NO: 15: 
716 

717 (i) SEQUENCE CHARACTERISTICS: 

718 (A) LENGTH: 64 amino acids 

719 (B) TYPE: amino acid 

720 (C) STRANDEDNESS: single 

721 (D) TOPOLOGY: linear 
722 

723 (ii) MOLECULE TYPE: peptide 

724 

725 (ix) FEATURE: 

726 (A) NAME/KEY: Modified-site

727 (B) LOCATION: 1..64

728 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

729 by nucleotides 1945-2136 of SEQ ID NO 2." 
730 

731 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 

732 

733 Gly Thr Gly Lys Lys Glu Phe Glu Gln Glu Ile Glu Gln Leu Glu Val

734 1 5 10 15
735 

736 Leu Tyr Pro Asn Lys Ala Lys Gly Val Ala Lys Phe Asn Val Pro Leu 

737 20 25 30 
738 

739 Ala His Met Ile Thr Ala Gly Ala Asp Phe Met Leu Val Pro Ser Arg
740 35 40 45 

741 

742 Phe Glu Pro Cys Gly Leu Ile Gln Leu His Ala Met Arg Tyr Gly Thr

743 50 55 60 
744 

745 

746 (2) INFORMATION FOR SEQ ID NO:16: 
747 

748 (i) SEQUENCE CHARACTERISTICS: 

749 (A) LENGTH: 29 amino acids

750 (B) TYPE: amino acid 

751 (C) STRANDEDNESS: single 

752 (D) TOPOLOGY: linear 
753 

754 (ii) MOLECULE TYPE: peptide 

755 

756 (ix) FEATURE: 

757 (A) NAME/KEY: Modified-site

758 (B) LOCATION: 1..29 

759 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

760 by nucleotides 2301-2386 of SEQ ID NO 2." 
761 

762 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

763

764 Val Pro Ile Cys Ala Ser Thr Gly Gly Leu Val Asp Thr Val Lys Glu

765 1 5 10 15 



766 

767 Gly Tyr Thr Gly Phe His Met Gly Ala Phe Asn Val Glu 

768 20 25 
769 

770 

771 (2) INFORMATION FOR SEQ ID NO: 17: 
772 

773 (i) SEQUENCE CHARACTERISTICS: 

774 (A) LENGTH: 19 amino acids 

775 (B) TYPE: amino acid 

776 (C) STRANDEDNESS: single

777 (D) TOPOLOGY: linear 
778 

779 (ii) MOLECULE TYPE: peptide 

780 

781 (ix) FEATURE: 

782 (A) NAME/KEY: Modified-site

783 (B) LOCATION: 1..19

784 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

785 by nucleotides 2492-2459 of SEQ ID NO 2." 
786 

787 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

788 

789 Cys Asp Val Val Asp Pro Ala Asp Val Leu Lys Ile Val Thr Thr Val

790 1 5 10 15 
791 

792 Ala Arg Ala 

793 

794 

795 (2) INFORMATION FOR SEQ ID NO: 18: 
796 

797 (i) SEQUENCE CHARACTERISTICS: 

798 (A) LENGTH: 111 amino acids 

799 (B) TYPE: amino acid 

800 (C) STRANDEDNESS: single 

801 (D) TOPOLOGY: linear 
802 

803 (ii) MOLECULE TYPE: peptide 

804 

805 (ix) FEATURE:

806 (A) NAME/KEY: Modified-site

807 (B) LOCATION: 1..111

808 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

809 by nucleotides 1200-1532 of SEQ ID NO 5." 
810 

811 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

812 

813 Met Ala Ser Ile Thr Ala Ser His His Phe Val Ser Arg Ser Gln Thr

814 1 5 10 15
815

816 Ser Leu Asp Thr Lys Ser Thr Leu Ser Gln Ile Gly Leu Arg Asn His






817 20 25 30 

818 

819 Thr Leu Thr His Asn Gly Leu Arg Ala Val Asn Lys Leu Asp Gly Leu 

820 35 40 45 
821 

822 Gln Ser Thr Thr Asn Thr Lys Val Thr Pro Lys Met Ala Ser Arg Thr

823 50 55 60 
824 

825 Glu Thr Lys Arg Pro Gly Cys Ser Ala Thr Ile Val Cys Gly Lys Gly

826 65 70 75 80
827

828 Met Asn Leu Ile Phe Val Gly Thr Glu Val Gly Pro Trp Ser Lys Thr

829 85 90 95 
830 

831 Gly Gly Leu Gly Asp Val Leu Gly Gly Leu Pro Pro Ala Leu Ala 

832 100 105 110 
833 

834 

835 (2) INFORMATION FOR SEQ ID NO: 19: 
836 

837 (i) SEQUENCE CHARACTERISTICS: 

838 (A) LENGTH: 43 amino acids 

839 (B) TYPE: amino acid 

840 (C) STRANDEDNESS: single 

841 (D) TOPOLOGY: linear 
842 

843 (ii) MOLECULE TYPE: peptide 

844 

845 (ix) FEATURE: 

846 (A) NAME/KEY: Modified-site

847 (B) LOCATION: 1..43 

848 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

849 by nucleotides 3817-3945 of SEQ ID NO. 5." 
850 

851 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

852 

853 Cys Asp Val Val Asp Pro Ala Asp Val Leu Lys Ile Val Thr Thr Val

854 1 5 10 15 

855 

856 Ala Arg Ala Leu Ala Val Tyr Gly Thr Leu Ala Phe Ala Glu Met Ile

857 20 25 30 
858 

859 Lys Asn Cys Met Ser Glu Glu Leu Ser Trp Lys 

860 35 40 
861 

862 (2) INFORMATION FOR SEQ ID NO: 20: 
863 

864 (i) SEQUENCE CHARACTERISTICS: 

865 (A) LENGTH: 38 amino acids 

866 (B) TYPE: amino acid 

867 (C) STRANDEDNESS: single 




868 (D) TOPOLOGY: linear

869 

870 (ii) MOLECULE TYPE: peptide 

871 

872 (ix) FEATURE: 

873 (A) NAME/KEY: Modified-site

874 (B) LOCATION: 1..38 

875 (D) OTHER INFORMATION: /note= "Amino acid sequence encoded 

876 by nucleotides 4031-4144 of SEQ ID NO. 5." 
877 

878 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

879 

880 Glu Pro Ala Lys Lys Trp Glu Thr Leu Leu Leu Gly Leu Gly Ala Ser 

881 1 5 10 15
882

883 Gly Ser Glu Pro Gly Val Glu Gly Glu Glu Ile Ala Pro Leu Ala Lys

884 20 25 30 
885 

886 Glu Asn Val Ala Thr Pro 

887 35 
888 

889 

890 (2) INFORMATION FOR SEQ ID NO: 21: 
891 

892 (i) SEQUENCE CHARACTERISTICS: 

893 (A) LENGTH: 17 base pairs 

894 (B) TYPE: nucleic acid 

895 (C) STRANDEDNESS: single 

896 (D) TOPOLOGY: linear 
897 

898 (ii) MOLECULE TYPE: RNA 

899 

900 (ix) FEATURE: 

901 (A) NAME/KEY: misc_RNA 

902 (B) LOCATION: 1 

903 (D) OTHER INFORMATION: /note= "Nucleotide 1 is a 7-methyl 

904 guanine added by 5'-5' linkage as an RNA cap."
905 

906 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

907 

908 GAUGGCAAGA AAAAAAA 
909 